Exploring the Value of Supporting Multiple DSM Protocols in Hardware DSM Controllers
Abstract
The performance of a hardware distributed shared memory (DSM) system depends largely on its architect’s ability to reduce the number of remote memory misses. Previous attempts to solve this problem include supporting both the CC-NUMA and S-COMA architectures in the same machine and providing a programmable DSM controller that can emulate any DSM mechanism. In this paper we first present the design of a DSM controller that supports multiple DSM protocols in custom hardware and allows the programmer or compiler to specify, on a per-variable basis, which protocol keeps that variable coherent. The simulated performance of this DSM controller compares favorably with that of conventional single-protocol custom hardware designs, often outperforming the conventional systems by a factor of two. To achieve these results, the multi-protocol DSM controller needed to support only two DSM architectures (CC-NUMA and S-COMA) and three coherency protocols (release-consistent and sequentially consistent write-invalidate, and release-consistent write-update). This work demonstrates the value of supporting a degree of flexibility in a DSM controller design and suggests which operations such a flexible DSM controller should support.
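The per-variable protocol selection described in the abstract can be pictured with a small software model. This is only an illustrative sketch, not the paper's hardware design: the names `CoherencePolicy`, `PolicyTable`, `annotate`, and `lookup` are hypothetical, the default policy is an assumption, and the abstract does not say whether every architecture and protocol pairing is allowed.

```python
from dataclasses import dataclass
from enum import Enum


class Architecture(Enum):
    """The two DSM architectures named in the abstract."""
    CC_NUMA = "CC-NUMA"
    S_COMA = "S-COMA"


class Protocol(Enum):
    """The three coherency protocols named in the abstract."""
    SC_WRITE_INVALIDATE = "sequentially consistent write-invalidate"
    RC_WRITE_INVALIDATE = "release-consistent write-invalidate"
    RC_WRITE_UPDATE = "release-consistent write-update"


@dataclass(frozen=True)
class CoherencePolicy:
    """Hypothetical pairing of an architecture with a protocol."""
    architecture: Architecture
    protocol: Protocol


# Assumption: variables without an annotation fall back to a conventional
# single-protocol setting. The abstract does not specify a default.
DEFAULT_POLICY = CoherencePolicy(
    Architecture.CC_NUMA, Protocol.SC_WRITE_INVALIDATE
)


class PolicyTable:
    """Hypothetical table a programmer or compiler fills in per variable."""

    def __init__(self):
        self._by_variable = {}

    def annotate(self, variable, architecture, protocol):
        # Record which policy should keep this variable coherent.
        self._by_variable[variable] = CoherencePolicy(architecture, protocol)

    def lookup(self, variable):
        # Return the annotated policy, or the assumed default.
        return self._by_variable.get(variable, DEFAULT_POLICY)
```

For example, `table.annotate("grid", Architecture.S_COMA, Protocol.RC_WRITE_UPDATE)` would tag a hypothetical variable `grid`, while any variable left unannotated resolves to the assumed default.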
Similar resources
Providing Hardware DSM Performance at Software DSM Cost
Emerging trends in commodity network technology coupled with key insights from academic research in active memory systems are leading toward the realization of hardware DSM on commodity clusters. We call the result of this convergence active memory clusters. After discussing the current state of the art in hardware DSM, clusters, and software DSM architectures, we highlight the key differences b...
Hardware Support for Flexible Distributed Shared Memory
Workstation-based parallel systems are attractive due to their low cost and competitive uniprocessor performance. However, supporting a cache-coherent global address space on these systems involves significant overheads. We examine two approaches to coping with these overheads. First, DSM-specific hardware can be added to the off-the-shelf component base to reduce overheads. Second, application...
Active Memory Clusters: Efficient Multiprocessing on Next-Generation Servers
We show how key insights from our research into active memory systems, coupled with emerging trends in commodity network technology, are leading toward the realization of hardware distributed shared memory (DSM) on clusters of industry-standard workstations. We call the result of this convergence active memory clusters. After discussing the current state of the art in hardware DSM, clusters, an...
Exploiting the Benefits of Multiple-Path Network DSM Systems: Architectural Alternatives and Performance Evaluation
Modern high performance networks being used for scalable distributed shared memory (DSM) systems support multiple paths to increase bandwidth and/or reduce contention. Such networks violate the constraint of pairwise in-order message delivery implicitly required by many existing directory-based cache coherence protocols. To solve this problem, two alternative strategies are currently used by ...
Flexibility Implies Performance
No single coherence strategy suits all applications well. Many promising adaptive protocols and coherence predictors, capable of dynamically modifying the coherence strategy, have been suggested over the years. While most dynamic detection schemes rely on plenty of dedicated hardware, the customization technique suggested in this paper requires no extra hardware support for its per-applicati...